Efficient Algorithms on Sequence Binary Decision Diagrams for Manipulating Sets of Strings

Authors

  • Shuhei Denzumi
  • Ryo Yoshinaka
  • Shin-ichi Minato
  • Hiroki Arimura
Abstract

We consider sequence binary decision diagrams (sequence BDDs, or SDDs for short), a compact representation for manipulating sets of strings proposed by Loekito et al. (Knowl. Inf. Syst., 24(2), 235-268, 2009). An SDD resembles an acyclic DFA in binary form, but its reduction rules differ from those for DFAs. In this paper, we study the power of shared and reduced SDDs for storing and manipulating sets of strings. In particular, we first characterize minimal SDDs as reduced SDDs. Then, we present simple and efficient algorithms for various problems related to reduced and shared SDDs: on-the-fly and off-line minimization, dynamic string set construction, and factor SDD construction. Finally, we run experiments on real data sets that show the efficiency and usefulness of SDDs in large-scale string processing.
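To make the terms "shared" and "reduced" concrete, here is a minimal sketch, not the paper's algorithms. It assumes the usual SeqBDD semantics: a node with label c, a 0-child and a 1-child denotes L(0-child) ∪ c·L(1-child), and labels increase strictly along 0-edges. The class and method names (`SeqBDD`, `getnode`, `from_strings`, `contains`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; names and layout are assumptions, not the paper's code.
ZERO, ONE = 0, 1  # terminals: the empty set, and the set containing only ""


class SeqBDD:
    def __init__(self):
        self.nodes = {}   # node id -> (label, zero_child, one_child)
        self.unique = {}  # (label, zero_child, one_child) -> node id

    def getnode(self, label, zero, one):
        # Reduction (zero-suppress rule): a node whose 1-child is the
        # 0-terminal contributes no strings, so return its 0-child instead.
        if one == ZERO:
            return zero
        # Sharing: structurally identical nodes get the same id.
        key = (label, zero, one)
        if key not in self.unique:
            nid = len(self.nodes) + 2  # ids 0 and 1 are the terminals
            self.nodes[nid] = key
            self.unique[key] = nid
        return self.unique[key]

    def from_strings(self, strings):
        return self._build(sorted(set(strings)))

    def _build(self, strs):
        # strs is a sorted list of distinct strings
        if not strs:
            return ZERO
        has_empty = strs[0] == ""
        rest = strs[1:] if has_empty else strs
        groups = {}
        for s in rest:
            groups.setdefault(s[0], []).append(s[1:])
        # Build the 0-chain from its end so labels increase along 0-edges.
        node = ONE if has_empty else ZERO
        for c in sorted(groups, reverse=True):
            node = self.getnode(c, node, self._build(groups[c]))
        return node

    def contains(self, root, s):
        node, i = root, 0
        while node not in (ZERO, ONE):
            label, zero, one = self.nodes[node]
            if i < len(s) and s[i] == label:
                node, i = one, i + 1  # consume one symbol via the 1-edge
            else:
                node = zero           # try the next label via the 0-edge
        return node == ONE and i == len(s)


dd = SeqBDD()
root = dd.from_strings(["ab", "bb"])
print(dd.contains(root, "ab"), dd.contains(root, "b"), len(dd.nodes))
```

In this example the subgraph for the suffix "b" is created once and reused under both "a" and "b", so the diagram has three internal nodes where a plain trie for the same set would need four.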

Similar resources

Studies on Decision Diagrams for Efficient Manipulation of Sets and Strings

In many real-life problems, we are faced with manipulating discrete structures. Manipulation of large discrete structures is one of the most important problems in computer science. For this purpose, a family of data structures called decision diagrams is used. The origin of decision diagrams is the binary decision diagram (BDD), proposed by Bryant in the 1980s. BDD is a data structure to repre...

Building Substring Indices Using Sequence BDDs

There is a demand for efficient indexed-substring data structures, which can store all substrings of a given text. Suffix trees and Directed Acyclic Word Graphs (DAWGs) are examples of substring indices, but they lack operations for manipulating sets of strings. The Sequence Binary Decision Diagram (SeqBDD) data structure is a new type of Binary Decision Diagram (BDD), and ...

Suffix-DDs: Substring Indices Based on Sequence BDDs for Constrained Sequence Mining

In this paper, we study an efficient index structure, called Suffix Decision Diagrams (SuffixDDs), for knowledge discovery in large sequence data. Recently, Loekito, Bailey, and Pei (KAIS, 2009) proposed a new data structure for sequence data, called Sequence Binary Decision Diagram (SeqBDD), which is an extension of Zero-suppressed Binary Decision Diagrams (ZDDs) for sequences. SuffixDD is a c...

Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations

Manipulation of large sequence data is one of the most important problems in string processing. Recently, Loekito et al. (Knowl. Inf. Syst., 24(2), 235-268, 2009) have introduced a new data structure, called Sequence Binary Decision Diagrams (SeqBDDs, or SDDs), which are descendants of both acyclic DFAs (ADFAs) and binary decision diagrams (BDDs). SDDs can compactly represent sets of sequences ...

Graphillion: ZDD-based Software Library for Very Large Sets of Graphs

Graphillion is a library for manipulating very large sets of graphs, based on zero-suppressed binary decision diagrams (ZDDs) with advanced graph enumeration algorithms. Graphillion is implemented as a Python extension in C++, to encourage easy development of its applications without introducing significant performance overhead. Experimental results show that Graphillion allows us to manage an ...


Publication date: 2011